WHAT IS CLAIMED IS:

1. A method of recognizing a document image
including a plurality of areas, comprising the steps of:

a) inputting said document image as a digital
image;

b) specifying a background color of said
document image;

c) extracting a plurality of pixels located in
areas other than a background area from said document
image by use of said background color;

d) creating a plurality of connected elements
by combining said plurality of pixels; and

e) classifying said plurality of connected
elements into a plurality of fixed types of areas by
using at least features of shapes of said plurality of
connected elements to obtain an area-separated document
image.
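Steps (d) and (e) can be illustrated with a minimal sketch. It assumes the pixels extracted in step (c) are given as a boolean mask, combines them into 8-connected elements, and assigns each element an area type from the shape of its bounding box. The function names, the 8-connectivity, the type labels, and the thresholds `max_char_size` and `line_aspect` are illustrative assumptions and are not taken from the claims.

```python
from collections import deque


def connected_elements(mask):
    """Group True cells of a 2-D boolean mask into 8-connected elements.

    Returns one (top, left, bottom, right, pixel_count) tuple per element.
    """
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not mask[y][x] or seen[y][x]:
                continue
            seen[y][x] = True
            queue = deque([(y, x)])
            top, left, bottom, right, count = y, x, y, x, 0
            while queue:  # breadth-first flood fill of one element
                cy, cx = queue.popleft()
                count += 1
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for ny in range(cy - 1, cy + 2):
                    for nx in range(cx - 1, cx + 2):
                        if (0 <= ny < h and 0 <= nx < w
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right, count))
    return boxes


def classify_element(box, max_char_size=40, line_aspect=10):
    """Hypothetical shape rule; both thresholds are illustrative only."""
    top, left, bottom, right, _ = box
    height, width = bottom - top + 1, right - left + 1
    if max(height, width) >= line_aspect * min(height, width):
        return "ruled-line"          # long and thin
    if height <= max_char_size and width <= max_char_size:
        return "text"                # character-sized
    return "figure-or-photograph"    # everything larger
```

A practical classifier would likely use more shape features than the bounding box alone; the sketch only shows where such features enter the pipeline.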



2. The method as claimed in claim 1, further
comprising the steps of:

f) creating a binary image by binarizing said
area-separated document image;

g) classifying a plurality of areas included
in said binary image into said plurality of fixed types
of areas;

h) comparing a result of the step (e) and a
result of the step (g);

i) correcting said area-separated document
image if said result of the step (e) is not equal to
said result of the step (g); and

j) recognizing a character in a text area of
said area-separated document image.



3. The method as claimed in claim 1, wherein 
said step (b) includes the steps of: 

k) clustering a plurality of colors on said 
document image; and 

l) setting a representative color of a largest
cluster obtained by the step (k) to said background
color.



4. The method as claimed in claim 3, wherein
said step (k) includes the steps of: 

sampling each of the plurality of pixels at 
regular intervals; and 

clustering the plurality of colors on said 
document image by use of a plurality of pixel values 
obtained by smoothing pixels surrounding said each of 
the plurality of pixels. 
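The background-color estimation of claims 3 and 4 can be sketched as follows. The sketch samples pixels at a regular interval, smooths each sample by averaging its 3x3 neighbourhood, groups the smoothed colors, and returns the mean color of the largest group. The coarse color quantisation used here is only a stand-in for a clustering step, and the names `estimate_background`, `step`, and `bin_size` are assumptions made for illustration.

```python
from collections import defaultdict


def estimate_background(image, step=4, bin_size=32):
    """image: list of rows of (r, g, b) tuples with channels in 0..255."""
    h, w = len(image), len(image[0])
    clusters = defaultdict(list)
    for y in range(0, h, step):            # sample at regular intervals
        for x in range(0, w, step):
            # Smooth: average the 3x3 neighbourhood, clipped at the borders.
            neigh = [image[ny][nx]
                     for ny in range(max(0, y - 1), min(h, y + 2))
                     for nx in range(max(0, x - 1), min(w, x + 2))]
            mean = tuple(sum(p[c] for p in neigh) // len(neigh)
                         for c in range(3))
            # Assumption: quantised color bins play the role of clusters.
            key = tuple(v // bin_size for v in mean)
            clusters[key].append(mean)
    largest = max(clusters.values(), key=len)
    # Representative color of the largest cluster: its per-channel mean.
    return tuple(sum(p[c] for p in largest) // len(largest) for c in range(3))
```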



5. The method as claimed in claim 1, further
comprising the step of reducing a size of said document
image, wherein said step of reducing the size of said
document image includes the steps of:

dividing said document image into a plurality of
blocks;

obtaining a representative color of each of
said plurality of blocks;

determining colors of said plurality of blocks
after sizes of said plurality of blocks are reduced, by
comparing said representative color and said background
color; and

reducing said plurality of blocks into the
plurality of pixels having said colors.
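One possible reading of this size reduction is sketched below. Each block is replaced by a single pixel: the block mean is taken as the representative color, and the output pixel is the background color when the representative is close to it, otherwise the representative itself. The choice of the mean as representative, the per-channel `tolerance` test, and the function name `reduce_image` are assumptions; the claim does not specify them.

```python
def reduce_image(image, background, block=3, tolerance=30):
    """Shrink an image of (r, g, b) tuples by one pixel per block."""
    h, w = len(image), len(image[0])
    reduced = []
    for by in range(0, h, block):
        row = []
        for bx in range(0, w, block):
            pixels = [image[y][x]
                      for y in range(by, min(by + block, h))
                      for x in range(bx, min(bx + block, w))]
            # Assumed representative color: per-channel mean of the block.
            rep = tuple(sum(p[c] for p in pixels) // len(pixels)
                        for c in range(3))
            # Compare the representative color with the background color.
            near_bg = all(abs(rep[c] - background[c]) <= tolerance
                          for c in range(3))
            row.append(background if near_bg else rep)
        reduced.append(row)
    return reduced
```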



6. The method as claimed in claim 5, wherein
said each of the plurality of blocks is a 3X3 or 4X4 
grating. 



7. The method as claimed in claim 1, wherein 
said step (c) includes the step of determining a focused 
pixel as a pixel located in an area other than said 
background area if a difference between three primary 
colors of said background color and said focused pixel 
is larger than a fixed value. 
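The test in claim 7 can be sketched as a per-pixel comparison against the background color. The claim does not state how the differences in the three primary colors are combined, so this sketch assumes a pixel is non-background when any single channel differs by more than the threshold; the names and the default `threshold` are likewise illustrative.

```python
def is_foreground(pixel, background, threshold=48):
    """True if any of the R, G, B channels differs by more than threshold."""
    return any(abs(p - b) > threshold for p, b in zip(pixel, background))


def foreground_mask(image, background, threshold=48):
    """Apply the test to every pixel of an image of (r, g, b) tuples."""
    return [[is_foreground(px, background, threshold) for px in row]
            for row in image]
```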



8. The method as claimed in claim 1, further 
comprising the steps of: 

creating the document image, in which a figure
or photograph rectangular area separated by said step
(e) is painted over with a specified color; 

binarizing said document image; and 
recognizing characters on a binary image 
obtained by binarizing said document image. 



9. The method as claimed in claim 1, further 
comprising the step of recursively performing said step 
(e) to a specific rectangular area classified at said 
step (e) . 



10. A method of recognizing a document image,
comprising the steps of:

a) inputting said document image as a digital
image;

b) performing color area separation to said
document image;

c) creating a binary image for each area
separated by said color area separation;

d) creating a single binary image by combining
said binary image for each area, thereby performing
binarization to said document image;

e) performing binary area separation to said
single binary image;

f) comparing a result of said color area
separation and a result of said binary area separation;
and

g) obtaining a binary image and a result of
area separation by performing a feedback process until a
certain condition is satisfied, or for a fixed number of
times, in accordance with a result of the step (f).



11. The method as claimed in claim 10,
wherein said feedback process is performed in a case in
which the certain condition is not satisfied in a range
of said document image as the result of the step (f),
said feedback process including the steps of:

creating an area that includes said range;

performing said color area separation, said
binarization and said binary area separation to said
area; and

performing said step (f).

12. The method as claimed in claim 10,
wherein said feedback process is performed in a case in
which a text line is extracted from a range in said
document image by one of said color area separation and
said binary area separation, and a character rectangle
including characters is not extracted from said range by
the other, said feedback process including the steps of:

specifying a character color of said character
rectangle;

determining that said range includes a
character if said character color is even throughout
said range;

performing said color area separation, said
binarization, and said binary area separation to said
range by use of said character color; and

performing said step (f).



13. The method as claimed in claim 12,
wherein said feedback process includes the steps of:

creating an area including the text line in a
case in which said text line extracted from the range by
said color area separation does not exist in the result
of the binary area separation as the result of said step
(f);

performing said binarization and said binary
area separation to said area; and

performing said step (f).



14. The method as claimed in claim 10,
wherein said feedback process is performed in a case in
which layout features of a fixed number or more than the
fixed number of text lines are continuously different
between said result of the color area separation and
said result of the binary area separation as the result
of said step (f), said feedback process including the
steps of:

creating an area including said text lines;

binarizing said area;

performing said binary area separation to said
area; and

performing said step (f).



15. The method as claimed in claim 10,
wherein an image-division-type binarizing method is
applied to a text area of said document image, and a
discriminant analysis method is applied to ruled-line,
figure, and photograph areas.
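In the binarization literature, the discriminant analysis method is commonly identified with Otsu's threshold selection, which picks the gray level that maximises the between-class variance of the two resulting pixel classes. Assuming that is the method meant here, a minimal sketch for a list of 8-bit gray values is given below; the function name `otsu_threshold` is illustrative.

```python
def otsu_threshold(gray_values):
    """Return t maximising between-class variance; values <= t form class 0."""
    hist = [0] * 256
    for v in gray_values:
        hist[v] += 1
    total = len(gray_values)
    sum_all = sum(i * n for i, n in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = sum0 = 0
    for t in range(256):
        w0 += hist[t]                 # pixel count of class 0
        sum0 += t * hist[t]           # intensity sum of class 0
        w1 = total - w0               # pixel count of class 1
        if w0 == 0 or w1 == 0:
            continue                  # one class is empty; no split here
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        between = w0 * w1 * (mean0 - mean1) ** 2
        if between > best_var:
            best_t, best_var = t, between
    return best_t
```

An image-division-type method for text areas would, under the same reading, compute such a threshold separately for each sub-block of the area instead of once for the whole area.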



16. The method as claimed in claim 10,
wherein said color area separation includes the steps
of:

specifying a background color of said document
image;

extracting a plurality of pixels located
outside a background area from said document image by
use of said background color;

creating a plurality of connected elements by
combining said plurality of pixels; and

classifying said plurality of connected
elements into a plurality of fixed types of areas by use
of at least features of shapes of said connected
elements to obtain an area-separated document image.



17. A document-image recognition device
recognizing a document image including a plurality of
areas, comprising:

an input unit inputting said document image as
a digital image;

a background-color specifying unit specifying
a background color of said document image;

an extracting unit extracting a plurality of
pixels located outside a background area from said
document image by use of said background color;

a creating unit creating a plurality of
connected elements by combining said plurality of
pixels; and

a classifying unit classifying said plurality
of connected elements into a plurality of fixed types of
areas by use of at least features of shapes of said
connected elements to obtain an area-separated document
image.



18. The document-image recognition device as
claimed in claim 17, further comprising:

a binary-image creating unit creating a binary
image by binarizing said area-separated document image;

a correcting unit classifying a plurality of
areas included in said binary image into said plurality
of fixed types of areas, and correcting said
area-separated document image by comparing said
area-separated document image with a result of classifying
the plurality of areas; and

a recognizing unit recognizing a character in
a text area of said document image.



19. The document-image recognition device as
claimed in claim 17, wherein said background-color
specifying unit includes:

a clustering unit clustering a plurality of
colors on said document image; and

a setting unit setting a representative color
of a largest cluster obtained by clustering said
plurality of colors on said document image to said
background color.



20. The document-image recognition device as
claimed in claim 19, wherein said clustering unit
includes:

a sampling unit sampling the plurality of
pixels at regular intervals; and

a cluster unit clustering the plurality of
colors on said document image by use of a plurality of
pixel values obtained by smoothing pixels surrounding
said plurality of pixels.



21. The document-image recognition device as
claimed in claim 17, further comprising a reducing unit
reducing a size of said document image, wherein said
reducing unit includes:

a dividing unit dividing said document image
into a plurality of blocks;

a representative-color obtaining unit
obtaining a representative color of each of said
plurality of blocks;

a color determining unit determining colors of
said plurality of blocks after sizes of said plurality
of blocks are reduced, by comparing said representative
color and said background color; and

a block reducing unit reducing said plurality
of blocks into the plurality of pixels having said
colors.



22. The document-image recognition device as
claimed in claim 21, wherein each of said plurality of
blocks is a 3X3 or 4X4 grating.



23. The document-image recognition device as
claimed in claim 17, wherein said extracting unit
includes a pixel determining unit determining a focused
pixel as a pixel located outside said background area if
a difference between three primary colors of said
background color and said focused pixel is larger than a
fixed value.



24. The document-image recognition device as
claimed in claim 17, further comprising:

a document-image creating unit creating the
document image, in which a figure or photograph
rectangular area separated by said classifying unit is
painted over with a specified color;

a binarizing unit binarizing said document
image; and

a character recognizing unit recognizing
characters on a binary image obtained by binarizing said
document image.



25. The document-image recognition device as
claimed in claim 17, recursively carrying out a process
performed by said classifying unit to a specific
rectangular area classified by said classifying unit.



26. A document-image recognition device
recognizing a document image, comprising:

an input unit inputting said document image as
a digital image;

a color area separation unit performing color
area separation to said document image;

a binary-image creating unit creating a binary
image for each area separated by said color area
separation;

a binary area separation unit creating a
single binary image by combining said binary image for
each area, thereby performing binarization to said
document image, and performing binary area separation to
said single binary image;

a comparing unit comparing a result of said
color area separation and a result of said binary area
separation; and

an obtaining unit obtaining a binary image and
a result of area separation by performing a feedback
process until a certain condition is satisfied, or for a
fixed number of times, in accordance with a result of
comparison carried out by said comparing unit.



27. The document-image recognition device as
claimed in claim 26, wherein said feedback process is
performed in a case in which a text line is extracted
from a range in said document image by one of said color
area separation and said binary area separation, and a
character rectangle including characters is not
extracted from said range by the other, said feedback
process including the steps of:

specifying a character color of said character
rectangle;

determining that said range includes a
character if said character color is even throughout
said range;

performing said color area separation, said
binarization, and said binary area separation to said
range by use of said character color; and

performing said comparison.



28. The document-image recognition device as
claimed in claim 27, wherein said feedback process
includes the steps of:

creating an area including the text line in a
case in which said text line extracted from the range by
said color area separation does not exist in the result
of the binary area separation as the result of said
comparison;

performing said binarization and said binary
area separation to said area; and

performing said comparison.



29. The document-image recognition device as
claimed in claim 26, wherein said feedback process is
performed in a case in which layout features of a fixed
number or more than the fixed number of text lines are
continuously different between said result of the color
area separation and said result of the binary area
separation as the result of said comparison, said
feedback process including the steps of:

creating an area including said text lines;

binarizing said area;

performing said binary area separation to said
area; and

performing said comparison.

30. A record medium readable by a computer,
tangibly embodying a program of instructions executable
by the computer to carry out a document-image
recognition process, said instructions comprising the
steps of:

a) inputting said document image as a digital
image;

b) specifying a background color of said
document image;

c) extracting a plurality of pixels located
outside a background area from said document image by
use of said background color;

d) creating a plurality of connected elements
by combining said plurality of pixels; and

e) classifying said plurality of connected
elements into a plurality of fixed types of areas by use
of at least features of shapes of said connected
elements to obtain an area-separated document image.



31. The record medium as claimed in claim 30,
wherein said instructions further include the steps of:

f) creating a binary image by binarizing said
area-separated document image;

g) classifying a plurality of areas included
in said binary image into said plurality of fixed types
of areas;

h) comparing a result of the step (e) and a
result of the step (g);

i) correcting said area-separated document
image if said result of the step (e) is not equal to
said result of the step (g); and

j) recognizing a character in a text area of
said area-separated document image.



32. A record medium readable by a computer,
tangibly embodying a program of instructions executable
by the computer to carry out a document-image
recognition process, said instructions comprising the
steps of:

a) inputting said document image as a digital
image;

b) performing color area separation to said
document image;

c) creating a binary image for each area
separated by said color area separation;

d) creating a single binary image by combining
said binary image for each area, thereby performing
binarization to said document image;

e) performing binary area separation to said
single binary image;

f) comparing a result of said color area
separation and a result of said binary area separation;
and

g) obtaining a binary image and a result of
area separation by performing a feedback process until a
certain condition is satisfied, or for a fixed number of
times, in accordance with a result of the step (f).



